active camera
XiCAD: Camera Activation Detection in the Da Vinci Xi User Interface
Jenke, Alexander C., Just, Gregor, de Boer, Claas, Wagner, Martin, Bodenstedt, Sebastian, Speidel, Stefanie
Purpose: Robot-assisted minimally invasive surgery relies on endoscopic video as the sole intraoperative visual feedback. The Da Vinci Xi system overlays a graphical user interface (UI) that indicates the state of each robotic arm, including the activation of the endoscope arm. Detecting this activation provides valuable metadata, such as camera movement information, that can support downstream surgical data science tasks including tool tracking, skill assessment, and camera control automation. Methods: We developed a lightweight pipeline based on a ResNet18 convolutional neural network to automatically identify the position of the camera tile and its activation state within the Da Vinci Xi UI. The model was fine-tuned on manually annotated data from the SurgToolLoc dataset and evaluated across three public datasets comprising over 70,000 frames. Results: The model achieved F1-scores between 0.993 and 1.000 for the binary detection of active cameras and correctly localized the camera tile in all cases, without false multiple-camera detections. Conclusion: The proposed pipeline enables reliable, real-time extraction of camera activation metadata from surgical videos, facilitating automated preprocessing and analysis for diverse downstream applications. All code, trained models, and annotations are publicly available.
- Health & Medicine (1.00)
- Media > Television (0.35)
- Media > Photography (0.35)
- Media > Film (0.35)
- Information Technology > Human Computer Interaction > Interfaces (1.00)
- Information Technology > Artificial Intelligence > Robots (1.00)
- Information Technology > Artificial Intelligence > Natural Language (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
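As a concrete illustration of the pipeline described above, a minimal PyTorch sketch of fine-tuning ResNet18 as a binary active-camera classifier might look as follows. The dataset layout (a hypothetical "camera_tiles/train" folder with "active"/"inactive" subfolders), hyperparameters, and preprocessing are illustrative assumptions, not the authors' released code.

    # Hypothetical sketch: fine-tuning ResNet18 as a binary
    # active-camera classifier, loosely following the pipeline the
    # abstract describes. Paths and hyperparameters are assumptions.
    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader
    from torchvision import datasets, models, transforms

    # UI tile crops sorted into class folders: "active/" and "inactive/"
    # (hypothetical layout; the real annotations come from SurgToolLoc).
    transform = transforms.Compose([
        transforms.Resize((224, 224)),
        transforms.ToTensor(),
        transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
    ])
    train_set = datasets.ImageFolder("camera_tiles/train", transform=transform)
    loader = DataLoader(train_set, batch_size=64, shuffle=True)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, 2)  # active vs. inactive
    model = model.to(device)

    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    criterion = nn.CrossEntropyLoss()

    model.train()
    for epoch in range(5):  # small budget; the UI cue is visually simple
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()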
FLAF: Focal Line and Feature-constrained Active View Planning for Visual Teach and Repeat
Fu, Changfei, Chen, Weinan, Xu, Wenjun, Zhang, Hong
This paper presents FLAF, a focal line and feature-constrained active view planning method for avoiding tracking failure in feature-based visual navigation of mobile robots. Our FLAF-based visual navigation is built upon a feature-based visual teach and repeat (VT&R) framework, which supports many robotic applications by teaching a robot to navigate on various paths that cover a significant portion of daily autonomous navigation requirements. However, tracking failure in feature-based visual simultaneous localization and mapping (VSLAM), caused by textureless regions in human-made environments, still limits the adoption of VT&R in the real world. To address this problem, the proposed view planner is integrated into a feature-based VSLAM system to build an active VT&R system that avoids tracking failure. In our system, a pan-tilt unit (PTU)-based active camera is mounted on the mobile robot. Using FLAF, the active camera-based VSLAM operates during the teaching phase to construct a complete path map and in the repeat phase to maintain stable localization. FLAF orients the robot toward more map points to avoid mapping failures during path learning, and toward more feature-identifiable map points beneficial for localization while following the learned trajectory. Experiments in real scenarios demonstrate that FLAF outperforms methods that do not consider feature identifiability, and our active VT&R system performs well in complex environments by effectively dealing with low-texture regions.
- Asia > China > Guangdong Province > Shenzhen (0.05)
- Asia > China > Guangdong Province > Guangzhou (0.04)
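The core of FLAF's view planning, orienting toward feature-identifiable map points, can be sketched as a simple scoring loop over candidate pan angles. The planar point format, field-of-view model, identifiability flags, and the best_pan_angle helper below are hypothetical simplifications; the paper's focal-line constraint is not reproduced.

    # Minimal sketch of the view-scoring idea in FLAF: among candidate
    # pan angles for the PTU, prefer the one whose field of view contains
    # the most feature-identifiable map points. All models here are
    # simplifying assumptions.
    import numpy as np

    def best_pan_angle(map_points, identifiable, robot_xy, candidates, fov_deg=60.0):
        """map_points: (N, 2) planar map-point positions.
        identifiable: (N,) bool, True if the point is feature-identifiable.
        candidates: iterable of candidate pan angles in radians."""
        half_fov = np.deg2rad(fov_deg) / 2.0
        rel = map_points - robot_xy                  # vectors robot -> points
        bearings = np.arctan2(rel[:, 1], rel[:, 0])  # bearing of each point
        best, best_score = None, -1
        for pan in candidates:
            # angular offset of each point from the candidate optical axis,
            # wrapped to (-pi, pi]
            diff = np.abs(np.angle(np.exp(1j * (bearings - pan))))
            in_view = diff < half_fov
            score = int(np.sum(in_view & identifiable))  # count useful points
            if score > best_score:
                best, best_score = pan, score
        return best

    # Example: choose among three pan angles given a toy map.
    pts = np.array([[2.0, 0.1], [2.0, -0.2], [0.0, 2.0]])
    ident = np.array([True, True, False])
    print(best_pan_angle(pts, ident, np.zeros(2), [0.0, np.pi / 2, np.pi]))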
Decision-Theoretic Coordination and Control for Active Multi-Camera Surveillance in Uncertain, Partially Observable Environments
Natarajan, Prabhu, Hoang, Trong Nghia, Low, Kian Hsiang, Kankanhalli, Mohan
A central problem of surveillance is to monitor multiple targets moving in a large-scale, obstacle-ridden environment with occlusions. This paper presents a novel principled Partially Observable Markov Decision Process-based approach to coordinating and controlling a network of active cameras for tracking and observing multiple mobile targets at high resolution in such surveillance environments. Our proposed approach is capable of (a) maintaining a belief over the targets' states (i.e., locations, directions, and velocities) to track them, even when they may not be observed directly by the cameras at all times, (b) coordinating the cameras' actions to simultaneously improve the belief over the targets' states and maximize the expected number of targets observed with a guaranteed resolution, and (c) exploiting the inherent structure of our surveillance problem to improve its scalability (i.e., linear time) in the number of targets to be observed. Quantitative comparisons with state-of-the-art multi-camera coordination and control techniques show that our approach can achieve higher surveillance quality in real time. The practical feasibility of our approach is also demonstrated using real AXIS 214 PTZ cameras.
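Capability (a), maintaining a belief over targets that are not always directly observed, can be illustrated with a toy discrete Bayes filter over grid cells for a single target. The motion kernel and observation likelihood below are placeholder assumptions, not the paper's models.

    # Toy sketch of belief maintenance: a discrete Bayes filter over
    # grid cells that keeps tracking even when no camera currently
    # observes the target.
    import numpy as np
    from scipy.ndimage import convolve

    def predict(belief, motion_kernel):
        """Diffuse the belief with a local motion model (convolution)."""
        b = convolve(belief, motion_kernel, mode="constant")
        return b / b.sum()

    def update(belief, likelihood):
        """Bayes update; pass likelihood = ones when no camera sees the target."""
        b = belief * likelihood
        return b / b.sum() if b.sum() > 0 else belief

    grid = np.full((20, 20), 1.0 / 400)        # uniform prior over a 20x20 area
    kernel = np.array([[0.05, 0.1, 0.05],
                       [0.10, 0.4, 0.10],
                       [0.05, 0.1, 0.05]])     # mostly stay, sometimes move

    grid = predict(grid, kernel)               # time passes, uncertainty grows
    obs = np.ones_like(grid); obs[5, 7] = 50.0 # a camera reports cell (5, 7)
    grid = update(grid, obs)                   # belief sharpens around it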
A Decision-Theoretic Approach to Dynamic Sensor Selection in Camera Networks
Spaan, Matthijs T. J. (Instituto Superior Técnico) | Lima, Pedro U. (Instituto Superior Técnico)
Nowadays many urban areas have been equipped with networks of surveillance cameras, which can be used for automatic localization and tracking of people. However, given the large resource demands of imaging sensors in terms of bandwidth and computing power, processing the image streams of all cameras simultaneously might not be feasible. In this paper, we consider the problem of dynamic sensor selection based on user-defined objectives, such as maximizing coverage or reducing localization uncertainty. We propose a decision-theoretic approach modeled as a POMDP, which selects k sensors to consider in the next time frame, incorporating all observations made in the past. We show how, by changing the POMDP's reward function, we can change the system's behavior in a straightforward manner, fulfilling the user's chosen objective. We successfully apply our techniques to a network of 10 cameras.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > Massachusetts (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
- Information Technology > Communications > Networks > Sensor Networks (0.95)
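The k-of-n selection idea can be sketched with a greedy picker that maximizes expected coverage under the current belief; swapping the gain term (e.g., weighting cells by localization uncertainty) changes the behavior, mirroring the paper's reward-shaping argument. The select_cameras helper and coverage masks below are illustrative assumptions, not the paper's POMDP solver.

    # Hedged sketch: given a belief over a person's location and each
    # camera's coverage mask, greedily pick the k cameras with the
    # largest marginal expected coverage.
    import numpy as np

    def select_cameras(belief, coverage_masks, k):
        """belief: (H, W) probabilities; coverage_masks: list of (H, W) bools."""
        chosen, covered = [], np.zeros(belief.shape, dtype=bool)
        for _ in range(k):
            # marginal gain: belief mass newly covered by each camera
            gains = [belief[mask & ~covered].sum() if i not in chosen else -1.0
                     for i, mask in enumerate(coverage_masks)]
            best = int(np.argmax(gains))
            chosen.append(best)
            covered |= coverage_masks[best]
        return chosen

    # Example: pick 2 of 4 quadrant cameras under a random belief.
    belief = np.random.dirichlet(np.ones(100)).reshape(10, 10)
    masks = [np.zeros((10, 10), dtype=bool) for _ in range(4)]
    masks[0][:5, :5] = masks[1][:5, 5:] = masks[2][5:, :5] = masks[3][5:, 5:] = True
    print(select_cameras(belief, masks, k=2))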
Active Gesture Recognition using Learned Visual Attention
Darrell, Trevor, Pentland, Alex
We have developed a foveated gesture recognition system that runs in an unconstrained office environment with an active camera. Using vision routines previously implemented for an interactive environment, we determine the spatial location of salient body parts of a user and guide an active camera to obtain images of gestures or expressions. A hidden-state reinforcement learning paradigm is used to implement visual attention. The attention module selects targets to foveate based on the goal of successful recognition, and uses a new multiple-model Q-learning formulation. Given a set of target and distractor gestures, our system can learn where to foveate to maximally discriminate a particular gesture.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > Middle East > Jordan (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Vision > Gesture Recognition (0.90)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.70)
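The attention module's learning rule can be illustrated with plain tabular Q-learning over foveation targets; the paper's hidden-state, multiple-model formulation is richer, and the states, actions, and reward signal below are toy stand-ins.

    # Minimal tabular Q-learning sketch of the attention idea: the agent
    # learns which body part to foveate so that recognition succeeds.
    import random

    ACTIONS = ["head", "left_hand", "right_hand", "torso"]  # foveation targets
    Q = {}                                                  # (state, action) -> value
    alpha, gamma, eps = 0.1, 0.9, 0.2

    def choose(state):
        if random.random() < eps:                           # explore
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: Q.get((state, a), 0.0))

    def learn(state, action, reward, next_state):
        best_next = max(Q.get((next_state, a), 0.0) for a in ACTIONS)
        old = Q.get((state, action), 0.0)
        Q[(state, action)] = old + alpha * (reward + gamma * best_next - old)

    # One illustrative interaction: foveating the hands pays off for a
    # hand gesture (reward 1), other targets do not (reward 0).
    for _ in range(500):
        s = "wave_candidate"
        a = choose(s)
        r = 1.0 if a in ("left_hand", "right_hand") else 0.0
        learn(s, a, r, "done")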